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Abstract 

Algorithms  that  attempt  to  accurately  compute  optical  flow  must  cope  with 
occlusion,  brightness  changes,  irregular  motion,  and  the  aperture  problem.  Most 
optical  flow  algorithms  have  tried  to  overcome  one  or  more  of  the  problems 
mentioned  above.  Yet,  claims  about  the  reliability  of  computed  flow  are  still  based 
on  ad  hoc  evaluation  schemes.  While  a perfect  algorithm  that  is  free  of  all  the 
problems  is  not  yet  available,  we  present  a reliable  algorithm,  which  despite  all 
the  difficulties,  generates  the  output  flow  field  wherever  possible,  and  associates 
with  the  output  evaluation  metrics  that  reflect  the  reliability  of  the  output.  The 
evaluation  metrics  are  complete  in  the  sense  that  they  are  theoretically  related  to 
the  physical  phenomena  that  cause  the  inherent  problems  noted  above.  Our 
approach  to  computing  optical  flow  expands  the  spatio-temporal  image  in  terms  of 
Hermite  polynomials  and  then  derives  multiple  Gaussian  smoothed  gradient 
constraint  equations,  which  constitute  an  overdeteimined  linear  system  that  can  be 
solved  for  image  flow  with  a least  square  method.  Using  the  QR  decomposition 
technique,  we  extract  the  residual,  condition  number,  and  the  determinant 
associated  with  the  linear  system.  These  measures  are  shown  to  correspond  to  the 
severity  of  the  aforementioned  problems.  By  thresholding  the  output  using  these 
evaluation  metrics,  we  extract  different  densities  of  output  that  can  be  compared 
with  other  algorithms  included  in  the  evaluation  scheme  developed  by  Barron, 
Fleet,  and  Beauchmin.  We  show  that  our  algorithm  performs  consistently  better 
over  a wide  variety  of  synthetic  and  real  image  sequences. 


1.  Introduction 


During  the  past  decade,  a great  deal  of  effort  has  been  spent  on  the  computation  of  optical  flow. 
Three  methodologies  have  been  explored:  spatio-temporal  energy  based[l,12,13,23],  correlation 
based[3,20,25],  and  gradient  based  methods[6,14, 18, 19,10].  There  is  a great  deal  of  biological 
motivation  for  spatio-temporal  approaches.  These  techniques  are  based  on  the  equation 


0)  = CO  • M + CO  • V 
t X y 


(1) 


where  (u,v)  is  the  velocity  of  a point  in  the  image,  and  co  , co  , co  are  spatial  and  temporal  fre- 

X y t 


quencies  at  the  point.  The  local  image  velocity  (m,v)  can  be  estimated  by  determining  the  plane  of 
nonzero  spectrum  in  the  spatio-temporal  frequency  domain[12].  The  correlation-based  method 
typically  finds  prominent  features  or  small  patches  of  the  image  and  attempts  to  match  them  in 
successive  images.  Most  researchers  have  used  three  strategies  to  match  the  candidate  features  or 
patches:  maximizing  the  correlation  response  measurement[20],  minimizing  the  sum-of-squared 
difference  of  the  corresponding  values[2],  and  coarse-to-fine  search  strategies[25].  The  gradient- 
based  techniques  extract  optical  flow  from  spatial  and  temporal  derivatives  of  the  image  intensity. 
In  one  approach[14],  the  first  derivatives  and  the  gradient  constraint  equation  are  used: 


I^u  + IyV  + I,  = 0 


(2) 


where  /^,  ly  denote  the  partial  derivatives  of  /,  the  image  intensity,  with  respect  to  x,  y,  t re- 
spectively. The  two  components  (m,  v)  of  the  image  velocity  are  constrained  by  only  one  linear 
equation  in  (2).  In  another  approach[19],  second  order  derivatives  are  used  to  constrain  image  ve- 
locity: 


(3) 


Clearly,  {u,  v)  can  be  recovered  from  (3)  wherever  they  are  linearly  independent.  Because  of  the 
sensitivity  of  numerical  differentiation  to  noise,  velocity  estimates  from  second  derivatives  are 
not  very  accurate.  In  [10],  Haralick  and  Lee  present  a facet  model  approach  to  obtain  image  ve- 
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locity  based  on  fitting  a function  to  the  intensities  in  local  neighborhoods  of  spatio-temporal  im- 
ages. However,  the  approach  requires  first  and  second  order  partial  derivatives  in  space  and  time. 
A set  of  (5x5x5)  operators  were  used  for  estimating  the  derivatives.  The  method  is  very  sensitive 
to  noise. 

Our  approach  is  similar  to  the  facet  model  method  in  that  both  estimate  a local  neighborhood  of 
the  spatio-temporal  image  with  a polynomial.  The  difference  is  in  the  use  of  an  orthogonal  Her- 
mite  polynomial  basis  to  estimate  the  derivatives  in  an  image  neighborhood.  Not  only  is  our 
approach  based  on  more  numerically  stable  estimation  techniques  but  in  addition,  as  the  behavior 
of  Hermite  polynomials  is  modeled  by  Gaussian  derivatives,  our  gradient  constraint  equations  are 
more  robust.  Numerous  physiological  experiments  [9,16,26]  support  the  theory  that  the  visual 
receptive  field  can  be  modeled  by  Gaussian  derivatives  of  various  widths.  Indeed,  Gaussian  deriv- 
atives are  linear,  spatial  shift  and  scale  invariant,  isotropic  and  insensitive  to  noise.  In  addition, 
they  are  rotation  and  dimension  invariant. 

We  categorize  our  algorithm  as  a gradient-based  method.  The  major  advantages  over  the  previous 
gradient  based  methods  are  numerical  stability,  characterization  of  the  performance  of  the  algo- 
rithm, and  simplicity.  Numerical  stability  is  achieved  by  using  Hermite  polynomials.  The  perfor- 
mance of  the  algorithm  is  evaluated  using  the  residual,  condition  number,  and  the  determinant 
obtained  from  an  overextended  linear  system  of  gradient  constraint  equations.  Simplicity  is  due  to 
the  fact  that  the  filtering  process  accomplishes  smoothing  and  differentiation  simultaneously.  The 
computations  involve  only  convolutions  and  solution  of  linear  systems. 

As  our  extensive  experiments  attest,  our  algorithm  performs  consistently  well  over  a large  variety 
of  images.  Furthermore,  the  paradigm  presented  here  is  justified  by  its  power  to  unify  and  predict 
theoretical  relationships  between  image  phenomena  and  computed  flow  field.  To  the  best  our 
knowledge,  only  Kearney  et  al.[15]  has  exploited  such  a relationship  as  an  integral  part  of  the  al- 
gorithm. 

The  organization  of  this  paper  is  as  follows.  We  lay  out  the  theoretical  foundation  in  Section  2. 
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Implementation  details  including  the  discussion  of  confidence  measures  are  presented  in  Section 
3 followed  by  extensive  experiments  and  comparisons  in  Section  4.  Section  5 concludes  the 
paper.  Analytical  results  about  the  interpretation  of  confidence  measures  are  aggregated  in  appen- 
dices. 


2.  Algorithm 

In  this  section,  we  establish  gradient  constraint  equations  similar  to  (2)  and  (3)  and  expand  them 
to  higher  order.  The  difference  here  is  that  the  gradients  are  computed  by  Gaussian  derivatives. 
The  algorithm  estimates  optical  flow  by  fitting  a three  dimensional  polynomial  to  a spatio-tempo- 
ral image  sequence.  The  model  is  defined  as  follows: 

I {x,  y,t)  = F {x-ut,  y-vt)  (4) 

Equation  (4)  is  based  on  the  assumption  that  the  motion  in  the  image  is  constant  over  time  and  the 
image  brighmess  pattern  does  not  change  over  time.  Using  the  Hermite  polynomial  basis  to  ex- 
pand the  function  / and  F,  optical  flow  (u,  v)  can  be  estimated.  The  algorithm  consists  of  two  stag- 
es: 1,  convolution  of  the  image  / with  a set  of  derivative  masks  generated  by  Hermite  polynomials 
and  2,  least  square  error  solution  of  an  overdetermined  linear  system. 


2.1  Hermite  Polynomials 

The  nth  Hermite  polynomial  (j:)  is  a solution  of 


d dH„ 

"_2;c— " + 2«//„  = 0 

dx^ 

The  (x)  are  derived  by  Rodrigues’  formula  [11] 


2 2 

H„{x)  = (-1)  e — ~e  . 

dx 

It  is  clear  from  (6)  that 

(x)  e where  is  the  set  of  polynomials  of  degree  n. 


(5) 


(6) 
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The  computation  of  (jc)  is  especially  easy  due  to  the  following  recursive  relations: 


= 2xH^(x)-2nH^_,{x) 
HqM  = 1 
//j  (x)  = 2x 

The  orthogonality  of  [H^  {x)  ] can  be  verified  from  (5)  and  (6). 


By  substituting  G(x)  for  e in  (6),  we  generalize  to  Hermite  polynomials  with  respect  to  the  Gaus- 
sian function.  Let  these  Hermite  polynomials  be  denoted  by  //„  {x) 


H„{x)  = (-1)"G‘‘w-L(GW) 

dx 

Note  that  H„  (x)  differs  from  (x)  by  a scaling  product: 


(7) 


H„(x)  = 


2 O 


H 


"I  ,l/2_ 

2 O 


(8) 


where  a is  the  standard  deviation  of  G(x). 

The  scalar  product  of  two  functions  and  the  L2-norm  of  a function  with  G(x)  as  a weight  function 
are  defined  as  follows: 


{a,  b)  = j G{x)a  (x)  b {x)  dx 

— oo 

The  orthogonality  of  {7?„  (;c)  } can  be  expressed  in  the  following  way: 


where 


rs 

mn 


\ <^m  = n 
0 <=>m^n 


-2n  , 5 

a «!o 


mn 


(9) 


The  3D  case  of  Hermite  polynomials  is  especially  simple  because  they  are  separable  with  respect 
to  dimension.  Thus  the  polynomial  with  order  n = i + j + kis: 
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Hijk{x,y,t)  = Hi{x)  ^Hj(y) 


(10) 


2.2  Derivation  of  Gradient  Constraint  Equations 

One  of  the  most  important  properties  of  Hermite  polynomials  is  the  property  of  Gaussian  deriva- 
tives. It  is  with  this  property  that  we  are  able  to  establish  gradient  constraint  equations.  This 
property  is  manifested  in  the  following  theorem. 

Theorem  1:  A one  dimensional  signal  I(x)  can  be  expanded  in  terms  of  Hermite  polynomials  as 


k = 0 


then/.  = (/,//t)  = where //qW  = 1 and/W  = 


k 

dl 

dx^ 


(11) 


The  proof  is  given  in  Appendix  A. 

From  Theorem  1,  we  know  that  [Ij^]  can  be  interpreted  in  two  ways:  the  expansion  coefficients 
and  Gaussian  derivatives  of  the  image  sequence. 

Consider  our  motion  model: 


I {x,  y,t)  = F {x-ut,  y-vt) 
Now  expand  both  sides  with  Hermite  polynomials, 


Hijk 


oo  oo  oo 

XXX  112 

{ = 0y  = 0yk=0 


OO  OO  OO 


= XXX^. 


H 


ijk 


where 


' lijk  = 


(12) 


Suppose  the  scene  is  smooth  and  the  motion  is  steady,  which  means  that  the  higher  order  terms  in 
the  expansion  vanish.  We  can  then  use  a limited  basis  to  represent  the  function.  If  only 


Hqqq,  Hiqq,  Hqiq,  Hqqi  are  used,  we  have  the  following: 


^000  “ ^000 

^100  “ ^100  (13) 

^010  “ -^010 
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BF  - 


BF 


BF  - 


^001  “ ^001  = (^»-^00l)  = (~»-^OOo)  = ( (-^)-5T  (-■^)37’'^OOo) 


001 


Bt 


Bx 


By' 


— ?F  — 

= (-W)  <g,  Hooo)  + (-V)  Hooo) 


= (-m)<F,//ioo>+  (-v)(F,//oio> 

= (-W)Fjqq+  (-V)Fqjq 

The  derivation  of  (14)  is  based  on  Theorem  1.  From  (13)  and  (14),  we  derive  the  first  order  gradi- 
ent constraint  equation. 

/qoi  + M^ioo  + ^^010  “ ^ 

If  a larger  basis  is  used,  we  can  have  additional  gradient  constraint  equations.  For  example,  with 

^000»^100»^010»^001»^101>^011»^110’^200>^020»  following  similar  reasoning,  we  have  the 
second  order  gradient  constraint  equations: 


/lOi  + M/200  ^^110  “ ^ 

/qu  + m/jjq  + v/q2o  = 0 


Equations  in  (15)  and  (16)  can  be  rewritten  in  a matrix  form: 


(16) 


^100 

^010 

^001 

0 

^200 

^110 

M 

V 

+ 

^101 

= 

0 

/no 

^020 

y 

/oil 

_0 

(17) 


Realizing  that  } is  just  the  Gaussian  derivatives  of  /,  we  note  the  similarity  between  (15),  (16) 
and  (2),  (3),  rewritten  here: 

l,+l,U+lyV  = 0 

+ + V = ® 

If  the  assumption  of  smooth  scene  and  steady  motion  are  not  practical,  we  may  use  a larger  poly- 
nomial basis  and  thus  generate  additional  gradient  constraint  equations,  while  still  maintaining  a 
linear  system  of  equations  in  u and  v.  More  generally. 
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h,j, k + 1, ifc- 1 “ ^ 


(18) 


In  fact,  we  can  use  as  many  equations  as  necessary  for  a particular  image  scene  and  motion. 


Another  approach  is  to  use  a nonlinear  system  to  achieve  higher  accuracy.  For  example,  we  could 
add  ^002  previous  second  order  basis.  The  additional  equation  will  be 


^002  (^200^^  ■^^020^^) 

The  equation  is  not  linear,  so  it  might  be  necessary  to  use  an  iterative  algorithm  to  search  for  the 
optimal  solution. 


3.  Implementation 


3.1  Localizing  Computation 


Recall  that  our  motion  model  assumes  that  motion  in  the  image  is  constant  over  time.  To  assure 
this  is  not  violated,  we  should  only  use  a small  local  neighborhood  to  compute  optical  flow  for 
each  pixel.  In  addition,  we  have  to  adapt  the  algorithm  to  the  discrete  case  as  required  for  the 


common  image  input.  Therefore,  we  define  local  expansion  coefficients  {x,  y,  t)  } to  approx- 
imate {/.y^ } and  redefine  the  inner  product  as  a local  computation  in  V: 


hjk  (x,  y,  t)  = {I  (x,  y,  t) , Hiji) 

= + ^0’  3”  + ^ + ffl)  ^ijk  (^0’  yO’  ^o)  ^ (^0’  ^0’  'o) 

V 

(20) 


Substituting  {1;^^  (x,  y,  t)  ) for  {/-^ ) in  (17),  we  have  an  overextended  linear  system: 


2ooi  (x,y,0 

0 

M(x,y,f) 

-1- 

2ioi  (x,y,t) 

~ 

0 

[v  (x,  y,  r)  J 

n 

/oii(x,y,  0_ 

u 

(21) 


hoo(x,y,t)  Iiio(x,y,t) 
f no  (x,y,t)  i 020  (-x,y,t) 

Note  that  we  have  reduced  global  computation  in  (17)  to  local  computation  in  (21).  Now  for  every 
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single  image  pixel,  there  is  an  associated  linear  system  that  can  be  solved  for  optical  flow. 


Let  ^ 


^010  (x,y,t) 
hooix,y,t)  I^iQ{x,y,t) 
lQ2Qix,y,t) 


J = 


u (x,  y,  t) 
y(x,  y,  t)_ 


,b  = W 


/ooi  (x,  y,  t) 
hoi  (x,  y,  0 


(22) 


We  introduce  a weight  matrix  W to  compensate  the  difference  of  norms  between  first  and  second 
order  Hermite  polynomials, 


W = 


0 0 

0 ^2  0 , 

0 0 W2 


(23) 


Since  {/^-^  (jc, y,  t)  } is  just  a local  approximation  of  [hji^],  we  solve  the  overextended  linear  sys- 
tem for/  (or  u and  v)  in  the  least  square  error  sense: 

E = min\\Af+  b\\ , (24) 

Using  the  QR  decomposition  method  [Householder  Triangularization],  we  derive  the  following: 


A = QR,  and  E = min\\QRf+b\\  = m/n||R/-i- , where  Q is  unitary.  (25) 


d 

R, 

R is  an  upper  triangular  matrix,  denoted  by 

0 

^2 

and 

S 

0 0 

[0 

oj 

where  R^  is  the  upper  square  matrix. 


Let  Q^b  be 


where  b^  is  the  upper  2-element  vector.  Equation  (25)  becomes 


E = min(\\R/+bJI^  +r) 

= r if  R^  is  not  singular.  (27) 

The  solution  is  computed  from  = 0 (28) 

In  the  actual  implementation,  we  use  a floating  point  computation;  as  a result,  R^  is  rarely  singu- 
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lar.  However,  the  behavior  of  determines  the  accuracy  of  the  solution.  Also  the  residual  r of 
the  overextended  linear  system  plays  an  important  role  in  determining  the  accuracy  of  the  solu- 
tion. We  devote  the  next  subsection  to  the  discussion  of  accuracy  of  the  computed  optical  flow 
and  associated  confidence  measures. 

3.2  Confidence  Measures 

Optical  flow  computed  from  noisy  data  is  unreliable  and  there  should  be  a mechanism  to  reject 
unreliable  or  poor  flow  estimates.  The  confidence  measure  is  defined  as  a quantitative  value  that 
indicates  the  degree  of  confidence  in  the  quality  of  the  computed  result.  Our  algorithm  provides 
ample  information  about  the  behavior  of  the  system  equations.  It  is  then  shown  that  this  informa- 
tion signifies  certain  image  phenomena,  e.g.,  occlusion,  which  present  difficulties  for  optical  flow 
computation.  Therefore,  it  can  be  utilized  as  confidence  measures.  We  first  analyze  some  potential 
confidence  measures  in  the  following  subsections. 

3.2.1  Residual 

The  residual  of  our  algorithm  is  \\Af+  b\\  or  r {=E)  (27).  The  residual  of  an  overextended  linear 
system  indicates  the  degree  to  which  the  equations  disagree  with  one  another.  Recalling  (15)and 
(16),  if  [I ) are  computed  exactly,  these  equations  should  hold.  The  reason  for  the  existence  of 

residual  lies  in  the  approximation  error  of  (jc,y,  0 )•  A high  approximation  error  may  indi- 
cate one  of  three  problems: 

1.  The  assumption  of  constant  motion  is  violated  in  the  window  V.  It  is  possible  that  the 
window  covers  more  than  one  moving  object.  Occlusion  and  multiple  independently 
moving  objects  in  a window  can  cause  this  problem. 

2.  The  assumption  of  constant  image  brightness  is  violated.  It  is  not  unusual  for  the 
brightness  of  an  object  to  change  when  the  viewing  angle  changes  due  to  relative 
motion.  In  addition,  the  observing  camera  may  adjust  the  picture  balance  for  different 
scenes,  resulting  in  change  of  object  brightness.  Similar  results  can  be  caused  by  the 
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shadow  of  a moving  object. 


3.  Truncation  error.  Truncation  errors  are  introduced  when  we  use  a small  window  to 
compute  [lijk  {x,  y,t)  }.  Within  the  small  window,  the  Hermite  polynomials  are  no 
longer  orthogonal  and  the  expansion  coefficients  are  not  accurate. 


We  can  model  the  above  errors  as  perturbations  to  the  linear  system  (Appendix  B): 

E = mini  + ^)f  + (tf  + Ab)  || , where  N and  Ab  denote  errors. 
We  prove  in  Appendix  B the  following  analytical  results: 

A/  = /-/=  -( (N/+  Ab) 


(29) 


r = 


I-A\A^a]  ^A^  ](Nf+Ab) 


(30) 

(31) 


Note  that  the  expressions  of  both  optical  flow  error  A/ (30)  and  residual  r (31)  contain  the  error 
vector  (A/+  Ab)  . Even  so,  it  may  be  deceptive  to  claim  that  the  residual  is  proportional  to  the 

{ T T 

optical  flow  error  because  the  error  vector  is  mapped  by  different  matrices  (I  A A I A , 


T . Vi ,T. 


-1 


T . 


I-A\^A  A j A ).  We  further  show  that  the  matrix  I -A\^A  A j A in  the  residual  expression 
has  only  one  nontrivial  eigenvalue,  which  happens  to  be  1.  This  makes  it  more  likely  for  the  error 
vector  to  be  mapped  to  a small  vector.  Nonetheless,  it  is  clear  that  a large  residual  certainly  means 
there  are  errors  and  the  optical  flow  result  may  be  inaccurate. 


Note  that  the  three  problems  mentioned  above  suggest  contradictory  choices  for  the  window  size. 
With  larger  windows,  problems  1 and  2 may  be  aggravated;  with  smaller  windows,  problem  3 
may  become  intolerable. 


Using  |l/r|  as  a confidence  measure,  we  can  locate  these  problems  and  eliminate  unreliable 
results. 


3.2.2  Condition  Number 

The  condition  number  of  , denoted  by  k (R^)  , is  defined  as  ||  ||^  J^||  and  can  be  shown  to  be 
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, where  X^-„  are  eigenvalues  of  . 

1^1  mm 

A condition  number  measures  the  extent  to  which  a linear  system  maps  the  input  error  into  output 
error,  or  in  brief,  the  numerical  instability  of  the  system.  The  higher  the  condition  number,  the 
more  ill-conditioned  a system  is.  Recalling  (24),  we  regard  b as  the  input  of  the  system  and /the 
output.  There  are  inherent  errors  in  the  elements  of  b.  We  certainly  do  not  consider/ which  con- 
tains errors  magnified  by  an  ill-conditioned  A reliable.  Since  matrix  A is  concerned  with  the  image 
texture  only  and  not  with  motion,  we  find  the  correspondence  between  a high  condition  number 
and  the  following  two  scenarios  of  the  image  neighborhood: 

1.  When  there  is  a steep  edge  in  the  ;c(y)  direction  (Fig  1.1),  so  that  the  first  order  and 
second  order  derivatives  are  very  large  for  x(y)  and  small  for  y(x). 

2.  When  there  is  a lack  of  texture  in  a direction  (Fig  1.2),  so  that  the  derivatives  in  the  x 
direction  are  approximately  proportional  to  the  derivatives  in  the  y direction,  i.e. 
I(x,y)  ^I(kx  + y) . 


Fig  1.1  Smoothed  steep  edge.  Fig  1.2  Lack  of  texture  in  x+y  direction. 

The  above  two  findings  are  easily  verified  from  the  QR  decomposition  process  as  explained  in 
Appendix  D. 

If  there  is  motion  in  the  area  where  one  of  the  two  scenarios  dominate,  then  it  corresponds  to  what 
is  known  as  the  aperture  problem.  Since  we  will  not  be  able  to  recover  the  velocity  accurately 
with  any  local  computation,  we  may  as  well  eliminate  the  results. 

So  1/k  (R^)  can  be  used  as  a confidence  measure. 
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3.2.3  Determinant 


The  determinant  of  is  • ?i2-  solving  (28)  or  f - -R~^  ’ the  determinant  plays  an 
important  role  in  the  matrix  inverse.  Since  we  use  the  QR  decomposition  method,  Q is  unitary 
(orthonormal  projection)  so  the  behavior  of  is  similar  to  the  original  A.  Looking  at  (24),  a 

small  determinant  of  R^  indicates  one  of  the  following  two  scenarios: 

1.  The  two  columns  of  A are  close  to  being  linearly  dependent.  This  is  the  same  as  the 
second  scenario  in  the  discussion  of  condition  number.  In  fact,  we  prove  in 
Appendix  E that  low  determinant  due  to  linear  dependency  also  causes  high  condi- 
tion number. 

2.  All  the  elements  of  A are  very  small.  This  corresponds  to  a uniform  brightness  area, 
e.g.  blue  sky. 

As  noted  before,  the  above  corresponds  to  the  general  case  of  the  aperture  problem.  It  is  interest- 
ing to  note  that  Barron,  Fleet,  and  Beauchemin  [4]  recognize  the  determinant  as  a better  confi- 
dence measure  in  the  application  of  the  liras  et  al.  [22]  optical  flow  algorithm  than  the  condition 
number  as  used  in  the  original  paper.  Our  analysis  using  the  principles  of  linear  algebra  agrees 
with  their  empirical  findings. 

So  we  shall  use  ^det  (R^)  | as  a confidence  measure. 

3.2.4  Integration  of  Confidence  Measures 

Based  on  the  above  analysis,  we  choose  a combination  of  confidence  measures  according  to  the 
nature  of  a given  image  sequence. 

If  the  image  sequence  contains  numerous  moving  objects  or  the  brighmess  changes  significantly, 
residuals  should  be  used  as  confidence  measures.  The  residual  is  unique  in  the  sense  that  it  cap- 
tures the  three  problems  in  3.2.1,  which  no  other  measure  does. 

Condition  number  and  determinant  have  something  in  common  although  they  may  capture  differ- 
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ent  scenarios.  Together  they  signify  the  relationship  between  numerical  instability  and  the  aper- 
ture problem.  We  suggest  the  multiplicative  combination  of  these  two,  det  {R^)  /k  , which 

is  equivalent  to  using  \Mmin'  proposed  by  Gkosi  et  al.[8]  in  a similar  context  and 

was  used  in  Barron’s  implementation  [4]  of  Lucas  and  Kanade’s  optical  flow  algorithm.  In  our  al- 
gorithm, it  simply  means  the  smaller  of  the  derivatives  in  x and  y directions.  We  shall  use  it  to 
capture  the  aperture  problems  and  to  avoid  numerical  instability. 

All  the  above  mentioned  problems  are  not  unique  to  our  algorithm.  In  fact,  many  of  them  are 
common  to  other  optical  flow  algorithms.  However,  we  believe  our  algorithm  is  the  first  serious 
effort  to  capture  most  of  the  problems  in  optical  flow  computation  in  a quantitative  way. 

4.  Experiments 

Based  on  the  work  of  Barron,  Fleet,  and  Beauchemin[4],  we  conducted  extensive  comparisons  of 
our  algorithm  with  other  current  optical  flow  algorithms,  including  ones  by  Horn  and 
Schunck[14],  Lucas  and  Kanade[17],  liras  et  al.  [22],  Nagel[19],  Anandan[2],  Singh[20],  Hee- 
ger[12],  Waxman  et  al.  [24],  Heet  and  Jepson[7].  The  synthetic  image  sequences  we  used  for 
comparison  are  Sinusoid,  Translating  Tree,  Diverging  Tree,  and  Yosemite  Ry-by.  All  of  the  above 
come  with  optical  flow  ground  truth.  The  real  image  sequences  we  used  for  demonstration  are  SRI 
Trees,  Rubik  Cube,  and  Hamburg  Taxi.  All  of  these  images  were  provided  by  J.L.  Barron. 

The  error  statistic  utilized  is  the  angle  error  between  the  computed  optical  flow  time- space  direc- 
tion (Mg,  Vg,  1)  and  the  ground  truth  flow  time- space  direction  (m^,  v^,  1)  averaged  over  the 

whole  image.  Note  that  the  3D  directions  are  different  from  the  intuitive  2D  flow  directions.  It 
takes  into  account  the  magnitude  as  normalized  with  respect  to  time.  Refer  to  [4]  for  more  details. 

We  recognize  the  importance  of  optical  flow  magnitude  for  subsequent  applications,  e.g.  time  to 
contact.  To  separate  magnitude  error  from  angle  error,  we  listed  the  magnitude  error,  defined  as 
the  difference  of  the  magnitudes  of  computed  flow  and  ground  truth  flow,  scaled  by  the  ground 
truth  flow.  But  there  are  no  comparison  data  available. 
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In  order  to  make  extensive  comparisons,  we  implemented  our  algorithm  in  such  a way  that  a 
threshold  on  a specified  confidence  measure  can  be  used  to  eliminate  unreliable  flow  values,  thus 
controlling  the  number  of  points,  or  density,  for  which  computed  flow  field  values  are  available. 
In  all  the  tables  in  the  following  subsections,  the  first  column  is  density,  the  second  and  third  col- 
umns present  the  error  statistics  for  our  algorithm  for  that  density,  and  the  fourth  and  fifth  columns 
present  the  error  statistics  for  a given  comparison  algorithm  for  the  same  density.  The  error  statis- 
tics and  associated  density  for  the  comparison  algorithms  were  obtained  from  Barron  et  al.  [4]. 
For  a single  technique  with  multiple  rows  of  data,  different  threshold  values  are  used  in  the  algo- 
rithm to  produce  multiple  densities  of  output.  For  the  actual  threshold  values  of  the  comparison 
algorithms,  refer  to  Barron  et  al.[4]. 

The  results  show  that  our  algorithm  performs  consistently  well  over  a wide  set  of  images,  if  not 
the  best  in  every  single  sequence.  On  the  other  hand,  most  of  the  other  techniques  included  in 
comparisons  perform  well  only  over  a narrow  set  of  images. 

4.1  Sinusoid 

Sinusoid  is  a synthetic  image  sequence  (Fig  2)  generated  by  a spatial  sinusoidal  wave  traversing 
toward  the  upper  right  side.  For  our  method  we  chose  a window  size  large  enough  (17x17x7  for 
x,y,t)  to  prevent  aliasing.  Fig  3.1  shows  the  true  optical  flow  for  sinusoid,  while  Fig  3.2  shows  the 
flow  computed  with  our  method.  The  output  density  in  Fig  3.2  is  100%.  |l/r|  was  used  as  the 
confidence  measure  in  Table  1.  Our  algorithm  performs  far  better  than  all  of  the  other  algorithms 
except  Fleet  and  Jepson’s.  The  average  magnitude  error  is  1.83%  for  100%  density. 


Fig  2.  Spatial  sinusoid 
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Fig  S.lTnie  optical  flow  for  sinusoid  Fig  3.2  Computed  optical  flow  (100%) 


Table  1:  Summary  of  Sinusoid  Error  Statistics 


Density 

Our  A1 

;orithm 

Other  Algorithm 

Average 

Error 

Standard 

Deviation 

Average 

Error 

Standard 

Deviation 

Technique  by 

100% 

0.65' 

0.25' 

4.19' 

0.50' 

Horn  & Schunck  (original  unthresholded) 

2.55' 

0.59' 

Horn  & Schunck  (modified  unthresholded) 

2.47' 

0.16' 

Lucas  and  Kanade  (unthresholded) 

2.59' 

0.71' 

liras  et  al.  (unthresholded) 

2.55' 

0.93' 

Nagel 

30.80' 

5.45' 

Anandan 

2.24' 

0.02' 

Singh  (step  1 unthresholded) 

0.03' 

o.or 

Fleet  and  Jepson 

12.8% 

0.65' 

0.26' 

64.26' 

26.14' 

Waxman  et  al. 

4.2  Translating  And  Diverging  Tree 

The  translating  and  diverging  tree  sequences  are  two  realistic  synthetic  sequences  simulating  the 
motion  of  simple  translation  and  expansion,  respectively,  of  a poster  (Fig  4).  The  window  size 
used  in  our  method  is  19x19x11  for  translating  tree  and  17x17x9  for  diverging  tree.  Due  to  the 
lack  of  texture  in  some  background  areas,  we  used  ^det  {R^)  | as  the  confidence  measure.  The 

magnitude  error  for  translating  tree  is  131%  for  100%  density  and  1.23%  for  50%  density;  for  di- 
verging tree,  it  is  2.08%  for  100%  density  and  1.51%  for  50%  density.  Fig  5 and  Fig  6 show  the 
results.  The  output  density  for  Fig  5.2  and  Fig  6.2  is  90%.  Error  statistics  are  shown  in  Table  2 
and  Table  3.  Note  that  when  the  density  is  low  (<10%),  the  error  increases.  It  is  because  the  area 
with  high  |der  | corresponds  to  high  contrast  comers  and  may  well  represent  motion  bound- 


16 


aries.  Only  Uras’s  and  Fleet  and  Jepson’s  algorithms  perform  better  than  ours  in  terms  of  average 


error  for  translating  tree.  For  diverging  tree,  our  algorithm’s  performance  is  good  but  not  out- 
standing. It  is  due  to  the  fact  that  the  velocity  variation  is  large  in  a local  window. 


Fig  4.  Translating  and  diverging  tree 


Fig  5.1  True  flow  for  translating  tree 


Fig  6.1  True  flow  for  diverging  tree  Fig  6.2  Computed  flow  for  diverging  tree  (90%) 
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Table  2:  Summary  of  Translating  Tree  Error  Statistics 


Density 

Our  Alj 

gorithm 

Other  Algorithm 

Average 

Error 

Standard 

Deviation 

Average 

Error 

Standard 

Deviation 

Technique  by 

100% 

1.05* 

1.45* 

38.72* 

27.67* 

Horn  & Schunck  (original  unthresholded) 

2.02* 

2.27* 

Horn  & Schunck  (modified  unthresholded) 

0.62* 

0.52* 

Uras  et  al.  (unthresholded) 

2.44* 

3.06* 

Nagel 

4.54* 

3.10* 

Anandan 

1.64* 

2.44* 

Singh  (step  1 unthresholded) 

1.25* 

3.29* 

Singh  (step  2 unthresholded) 

99.6% 

1.04* 

1.42* 

1.11* 

0.89* 

Singh  (step  2) 

74.5% 

0.72* 

0.82* 

0.32* 

0.38* 

Heet  and  Jepson 

53-57% 

0.66* 

0.77* 

32.66* 

24.50* 

Horn  & Schunck  (original) 

5.63* 

2.78* 

Heeger  (level  1) 

1.89* 

2.40* 

Horn  & Schunck  (modified) 

49.7% 

0.65* 

0.75* 

0.23* 

0.19* 

Fleet  and  Jepson 

44.2% 

0.63* 

0.69* 

8.50* 

13.50* 

Heeger  (level  0) 

40-42% 

0.62* 

0.66* 

0.46* 

0.35* 

Uras  et  al. 

0.72* 

0.75* 

Singh  (step  1) 

0.66* 

0.67* 

Lucas  and  Kanade 

26.8% 

0.60* 

0.59* 

0.25* 

0.21* 

Fleet  and  Jepson 

13.1% 

0.58* 

0.51* 

0.56* 

0.58* 

Lucas  and  Kanade 

1.9% 

0.70* 

0.60* 

6.66* 

10.72* 

Waxman  et  al. 

Table  3:  Summary  of  Diverging  Tree  Error  Statistics 


Density 

Our  Al 

gorithm 

Other  Algorithm 

Average 

Error 

Standard 

Deviation 

Average 

Error 

Standard 

Deviation 

Technique  by 

100% 

2.91* 

2.89* 

12.02* 

11.72* 

Horn  & Schunck  (original  unthresholded) 

2.55* 

3.67* 

Horn  & Schunck  (modified  unthresholded) 

4.64* 

3.48* 

Uras  et  al.  (unthresholded) 

2.94* 

3.23* 

Nagel 

7.64* 

4.96* 

Anandan 

17.66* 

14.25* 

Singh  (step  1 unthresholded) 

8.60* 

4.78* 

Singh  (step  2 unthresholded) 

99% 

2.83* 

2.57* 

8.40* 

4.78* 

Singh  (step  2) 

73.8% 

2.43* 

1.97* 

4.95* 

3.09* 

Heeger  (combined) 

60-61% 

2.32* 

1.84* 

0.99* 

0.78* 

Heet  and  Jepson 

8.93* 

7.79* 

Horn  & Schunck  (original) 

3.83* 

2.19* 

Uras  et  al. 

46-48% 

2.26* 

1.84* 

2.50* 

3.89* 

Horn  & Schunck  (modified) 

0.80’ 

0.73* 

Heet  and  Jepson 

1.94* 

2.06* 

Lucas  and  Kanade 

28.2% 

2.29* 

1.92* 

0.73* 

0.46* 

Heet  and  Jepson 
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Table  3:  Summary  of  Diverging  Tree  Error  Statistics 


Density 

Our  A1 

gorithm 

Other  Algorithm 

Average 

Error 

Standard 

Deviation 

Average 

Error 

Standard 

Deviation 

Technique  by 

24.3% 

2.29' 

1.91' 

1.65' 

1.48' 

Lucas  and  Kanade 

4.9% 

2.52' 

2.33' 

13.69' 

11.83' 

Waxman  et  al. 

3.9% 

2.51' 

2.27' 

5.62' 

6.16' 

Singh  (step  1) 

4.3  Yosemite  Fly-by 


The  Yosemite  Fly-by  sequence  is  a synthetic  realistic  image  sequence  (Fig  7).  The  flight  scene  is 
simulated  from  actual  aerial  photos  and  digital-terrain  maps  plus  artificial  sky  and  clouds.  Since 
the  clouds  in  the  sky  change  brightness  over  time,  it  presents  difficulties  for  our  method.  Based  on 
our  previous  analysis,  we  used  |l/r|  as  the  confidence  measure  to  eliminate  those  data  points 
which  correspond  to  a large  blank  area  in  the  sky  and  at  motion  boundaries  in  Fig  8.2.  Since  the 
motion  is  rather  fast  in  some  areas,  we  used  a larger  window  (21x21x7).  Error  statistics  are  shown 
in  Table  4.  The  magnitude  error  is  27.2%  for  100%  density  and  9.27%  for  50%  density.  Again,  the 
clouds  account  for  the  large  magnitude  error.  The  output  density  for  Fig  8.2  is  75%.  Only  Lucas 
and  Kanade’s  algorithm  performs  better  than  ours.  From  this  we  believe  our  algorithm  should 
work  well  with  real  images.  This  will  be  shown  in  the  next  subsection. 


Fig  7.  Yosemite  fly-by  image 
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Fig  8. 1 True  optical  flow  field  for  Yosemite  fly-by  Fig  8.2  Computed  optical  flow  for  Yosemite  fly-by  (75%) 


Table  4:  Summary  of  Yosemite  Fly-by  Error  Statistics 


Density 

Our  A1 

vorithm 

Other  Algorithm 

Average 

Error 

Standard 

Deviation 

Average 

Error 

Standard 

Deviation 

Technique  by 

100% 

10.17* 

13.61* 

32.43* 

30.28* 

Horn  & Schunck  (original  unthresholded) 

11.26* 

16.41* 

Horn  & Schunck  (modified  unthresholded) 

10.44* 

15.00* 

Uras  et  al.  (unthresholded) 

11.71* 

10.59* 

Nagel 

15.84* 

13.46* 

Anandan 

18.24* 

17.02* 

Singh  (step  1 unthresholded) 

13.16* 

12.07* 

Singh  (step  2 unthresholded) 

97.8% 

9.73* 

13.21* 

12.9* 

11.57* 

Singh  (step  2) 

64.2% 

5.14* 

7.67* 

20.89* 

34.26* 

Heeger  (level  0) 

59.6% 

4.95* 

7.42* 

25.41* 

28.14* 

Horn  & Schunck  (original) 

44.8% 

4.59* 

6.90* 

11.74* 

19.04* 

Heeger  (combined) 

33-35% 

4.42* 

6.72* 

4.10* 

9.58* 

Lucas  and  Kanade 

4.29* 

11.24* 

Fleet  and  Jepson 

5.48* 

10.41* 

Horn  & Schunck  (modified) 

30.6% 

4.40* 

6.70* 

4.95* 

12.39* 

Heet  and  Jepson 

15% 

4.34* 

6.57* 

10.51* 

12.11* 

Heeger  (level  1) 

6.74* 

16.01* 

Uras  et  al. 

8.7% 

4.40* 

6.76* 

3.05* 

7.31* 

Lucas  and  Kanade 

7.4% 

4.45* 

6.88* 

20.32* 

20.60* 

Waxman  et  al. 

2.4% 

4.90* 

7.60* 

11.51* 

11.83* 

Heeger  (level  2) 

2.2% 

4.85* 

7.51* 

16.29* 

25.70* 

Singh  (step  1) 
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4.4  Real  Images 

Current  optical  flow  algorithms  often  have  difficulty  with  real  image  sequences.  The  reasons  in- 
clude camera  jitter,  nonrigidity  of  objects,  and  brightness  variations  due  to  changes  in  lighting  or 
adjusted  camera  settings.  Here  we  show  that  our  confidence  measures  can  eliminate  many  of  the 
unreliable  flow  values.  The  optical  flow  output  of  the  following  subsections  has  undergone  thresh- 
olding using  two  confidence  measures,  |l/r|  and 

4.4.1  SRI  Trees 

The  SRI  trees  sequence  (Fig  9.1)  is  a scene  of  stationary  trees  taken  from  a camera  moving  later- 
ally. The  tree  in  the  center  of  the  scene  is  closer  to  the  camera  and  therefore  generates  large  optical 
flow,  as  shown  in  the  computed  optical  flow  field  (Fig  9.2).  The  noise  in  this  sequence  is  signifi- 
cantly larger  than  the  others.  The  output  density  for  this  sequence  is  71%. 


Fig  9.1  SRI  trees  image  Fig  9.2  Computed  optical  flow  field  for  SRI  trees 

4.4.2  Rubik  Cube 


The  Rubik  cube  sequence  (Fig  10.1)  is  generated  by  a stationary  camera  looking  at  a rotating  plat- 
form with  a Rubik  cube  on  it.  Note  that  the  output  in  the  area  without  texture  is  eliminated  by 
thresholding  on  the  confidence  measure  and  the  flow  field  is  well  organized  and  corresponds  to 
moving  objects  (Fig  10.2).  The  output  density  for  this  sequence  is  47%. 
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Fig  10. 1 Rubik  Cube  image  Fig  10.2  Computed  optical  flow  field  for  Rubik  Cube 


4.4.3  Hamburg  Taxis 

The  Hamburg  taxis  sequence  (Fig  11.1)  contains  three  independently  moving  vehicles  in  front  of 
a stationary  background.  Our  algorithm  captures  all  the  three  moving  objects,  as  shown  in  the 
flow  field  (Fig  11.2).  The  output  density  of  the  sequence  is  50%. 


Fig  11.1  Hamburg  taxi  image  Fig  11.2  Computed  optical  flow  for  Hamburg  taxis 


5.  Conclusion 

Some  claim  that  the  optical  flow  problem  is  ill-posed.  We  find  that  as  long  as  the  computation  is 
local  and  robust  numerical  techniques  are  employed  to  provide  sufficient  information  about  the 
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reliability  of  the  result,  then  the  extraction  of  reliable  optical  flow  values  is  possible  and  the  com- 
puted flow  field  should  be  useful  for  other  applications  such  as  obstacle  avoidance,  3-D  scene 
reconstruction,  etc.  The  algorithm  presented  here  is  an  attempt  in  this  direction.  We  have  demon- 
strated that  our  algorithm  compares  very  favorably  with  other  existing  algorithms  and  performs 
consistently  well  over  a variety  of  synthetic  and  real  images. 


Appendix.  A 


We  prove  Theorem  1 as  follows: 

Proof:  The  first  equality  comes  from  the  orthogonality  of  { //„  (jc)  } . We  now  prove  the  second 

equality,  which  claims  that  the  scalar  product  of  a function  and  the  ktti  Hermite  polynomial  is  equal 
to  the  scalar  product  of  the  /:th  derivative  of  the  function  and  1. 


= jG(.x)I(x)H;,(x)dx 


□ 


Appendix.  B 


First,  let  us  model  the  error  due  to  brightness  change.  The  context  (Section  3.2.1)  has  stated  the 


two  possible  reasons  for  brightness  change.  One  is  referred  to  as  reflection;  the  other  is  referred  to 
as  glooming.  Regarding  them  as  additive  noise,  we  model  the  image  intensity  as: 


/ {x,  y,t)  = / U,  y,t)  +N  (x,  y,  t)  (32) 

As  far  as  the  image  sequence  is  concerned,  the  difference  between  reflection  and  glooming  is  that 
reflection  can  be  represented  by  impulse  noise  and  glooming  can  be  represented  by  sustained 
noise  with  respect  to  time  (Fig  12). 


t 

t 

Fig  12.1  Reflection  noise  in  the  time  dimension. 

Fig  12.2  Glooming  noise  in  the  time  dimension. 

So  for  reflection  noise,  assuming  there  is  a single  frame  of  strong  reflection,  we  can  simplify  (32) 
as: 


Hx,  y,  0 

The  expansion  coefficients  are 


I (x, y,t)  t^O 

I {x,  y,t)  +r  (x,  y)  t = 0 


(33) 


hjk  0 = X ^o>  y + Jo’  ^ ^ijk  (^0’  yo’  ^ yo’  ^o) 

y 

= hjk  (.X,  y,t)  +'^^r(x  + XQ,y+  y^)  {Xq,  -t)  G (Xg,  y^,  -t) ) 

^0  ^0  (34) 

Realizing  that  the  zeroth  Hermite  polynomial  Hq  is  an  even  function  and  the  first  Hermite  polyno- 
mial 7?!  is  an  odd  function  (Fig  13),  we  know  that  when  t is  equal  to  0,  the  noise  affects  (0 
more  than  (r)  and  when  t is  not  equal  to  zero  but  small,  then  the  noise  affects  I iji  (r)  more 
than  IijQ  (t) . The  above  statement  will  hold  true  even  with  multiple  frames  of  strong  reflection. 
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Fig  13.1  Hq  {t)G  (t)  is  an  even  function. 


Fig  13.2  Hi  {t)  G (t)  is  an  odd  function. 


Note  that  (22)  indicates  that  b consists  of  /yi  (0  only  and  A consists  of  Iijq  (t)  only.  We  conclude 
here  that  when  a strong  reflection  happens  in  the  middle  frame  of  the  sequence,  the  matrix  A is 
perturbed  more;  otherwise  vector  b is  perturbed  more. 

For  glooming  noise,  assuming  that  from  a certain  point  of  time  on,  the  brightness  change  for  some 
objects,  we  model  the  image  intensity  as  follows: 


(35) 


The  noise  g (Xy  y)  may  well  be  proportional  to  I (x,  y)  in  the  real  world  though. 

Contrary  to  the  case  of  reflection  noise,  from  the  even  and  odd  properties  of  Hermite  polynomials 
and  similar  reasoning  to  that  above,  we  conclude  that  when  r is  equal  to  or  close  to  0,  the  noise 
affects  both  A and  b,  and  for  other  values  of  r,  the  effect  is  negligible.  This  fact  can  be  easily  seen 
by  aligning  Fig  12.1(2)  and  Fig  13.1(2)  correspondingly. 

Now  if  the  source  of  error  is  independently  moving  objects  (objects  A and  B)  in  a small  window 
or  occlusion,  the  image  intensity  is 


^ I{x,y,t)  (x,y,t)sV^ 
Jg(x,y,t)  {x,y,t)eVg’ 


Hx,y,t)  = 


(36) 


where  (V^)  contains  the  image  of  object  A(B).  Suppose  object  A occupies  a larger  area  in  the 
window  and  we  intend  to  estimate  the  velocity  of  A.  Regarding  the  image  of  object  B as  a source 
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of  noise,  we  can  model  7^  (x,  y,  t)  as  an  extension  of  object  A’s  image  I (x,  y,  t)  plus  noise 
r)  . 

For  most  kinds  of  noise,  the  following  model  may  be  adequate: 

E = min\  (A  + N)f+  (b  + Ab)  ||  as  written  in  (29). 

We  conclude  this  appendix  with  a claim  that  the  perturbation  caused  by  multiple  moving  objects 
is  usually  greater  than  that  caused  by  brightness  change.  It  is  often  more  random  and  covers  a 
larger  volume  of  the  3D  spatio-temporal  image. 


Appendix.  C 

Let  A and  b,  defined  in  (22),  contain  no  noise  and  let  the  noise  is  modelled  as  in  (29).  Then, 

E = Af+b  = 0 and  / = - A)~^ A'^ b . (37) 

Let  the  new  optical  flow  be  f and  new  residual  be  E and  assume  that  N « A and  Ab  « b element- 
wise. Then 


7 = -[(A+iV)^(A+N)]-i(A+iV)J’(i>  + A6)  ,and  (38) 

[ (A  + iV)  (A  + N)  ] -1  = (A^A  [/  + (A^A)  "1  (A^iV  + N^A)  ] ) “i 

= [/-  (A^A)-!  (A^N  + NU)]  (A^A)-!  ^ 50  (39) 

7 = - iA'^A)-Wb+  (A^A)-i  {ATN  + NTA)  (A^A)-lA^ft+  {A^A)-^A^b-  {A^A)-^A^Ab. 

Using  (37),  this  can  be  simplified  as  follows: 

/=/-  (A^A)-^A'^Nf-  {A'^A)-^A^Ab  and  A/=  (A^A) -U^iV/-  (A^ A)-^A^ Ab . 

For  the  residual,  substituting  f into  (29),  and  using  (37),  we  have 

£ = II  (A  + N)f-  A (A^A)  -^A'^Nf-  A (A^A)  “U^Ab  + b + Ab|| 

= ||(/-A(A^A)-iAJ')  (iV/+Ab)||  asin(31). 

To  understand  E better,  we  analyze  matrix  I-A  (A^A)  ~^A^ , denoted  by  K. 
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.2 


Let  a - /iio-^20(/o20’  ^ - ^io(/o20“^oio^iio»  ^ - -^010^200  “^lOc/iiO’ 


K = /-A(A^A)-iA^  = 


1 


a 


^ + c'^ 


a?-  ab  ac 
ab  b'^  be 
ac  be 


(40) 


The  eigenvalues  of  matrix  K are  (0,0,1),  which  means  that  it  maps  any  vector  to  one  single  direc- 
tion, specified  by  the  eigenvector  corresponding  to  the  nontrivial  eigenvalue. 


Appendix.  D 

We  show  that  a high  condition  number  of  matrix  corresponds  to  one  of  the  two  scenarios  in 
Section  3.2.2.  Recalling  (26),  with  Q unitary  (length-preserving  and  relative  angle-preserving), 

we  have  |>.i|  = ||  (/ioo.'^200’-^iio)1  > “<1  A2  + = ||  (^oiO’-^iiO’ Wl'  ^ high  condition 

number  means  either  1.  |}i2|  » |?ij|  or  2.  |?i2|  « . In  case  1,  it  is  apparent  that  the  second  column 

of  matrix  A (/qio’  -^no’  ^ much  larger  than  the  first  column  (/^oo’  ^200’  ^iio^  means 

that  the  derivatives  in  the  direction  of  y are  much  larger  than  the  derivatives  in  the  direction  of 
This  corresponds  to  the  first  scenario  in  3.2.2.  In  case  2,  there  are  two  possibilities:  one  is  that  d is 
also  very  small,  the  other  is  that  d is  comparable  with  . If  both  ^2  ^ small,  we  have  a 

case  opposite  to  case  1.  This  also  corresponds  to  the  first  scenario  in  3.2.2.  If  d is  not  small,  it  indi- 
cates that  the  two  column  vectors  of  matrix  A are  close  to  being  linearly  dependent.  It  means  that 
the  derivatives  in  the  x direction  are  proportional  to  the  derivatives  in  the  y direction.  This  corre- 
sponds to  the  second  scenario  in  3.2.2. 


Appendix.  E 

We  show  that  for  any  2x2  matrix  R,  a small  determinant  relative  to  the  elements  of  the  matrix 
results  in  a high  condition  number,  i.e.  if  \det  (R)  \«\Tr  (R)  | , then  cond(R)  will  be  large. 
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cond {R) 


|X.l 

— , let  Tr(R)  >0  without  loss  of  generality,  then 


cond (R) 


Tr(R)  +jTr(R)^-4detT^ 

Tr  (R)  - jTr{R)^-Adet  {R) 


|rr(i?)  + {Tr(R)  -2det(R))\ 
\Tr{R)  - (Tr(R)  -2det(R))\ 


Tr{R)-det(R)  Tr(R) 
\det{R)\  \det{R)\ 


as  can  be  seen  from  the  assumption  that  the  condition  number  is  very  large. 
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